AUDD: Audio Urdu Digits Dataset for Automatic Audio Urdu Digit Recognition
Abstract
The ongoing development of audio datasets for numerous languages has spurred research activities towards designing smart speech recognition systems. A typical system can be applied in many emerging applications, such as smartphone dialing, airline reservations, and automatic wheelchairs, among others. Urdu is a national language of Pakistan and is also widely spoken in other South Asian countries (e.g., India, Afghanistan). Therefore, we present a comprehensive audio dataset of Urdu digits ranging from 0 to 9. Our dataset has 25,518 sound samples collected from 740 participants. To test the proposed dataset, we apply different existing classification algorithms on it, including Support Vector Machine (SVM), Multilayer Perceptron (MLP), and flavors of EfficientNet. These serve as a baseline. Furthermore, we propose a convolutional neural network (CNN) for digit classification. We conduct experiments using these networks, and the results show that the proposed CNN is efficient and outperforms the baseline in terms of accuracy.
Similar resources
Automatic Speech Recognition of Urdu Digits with Optimal Classification Approach
Speech recognition for the Urdu language is an interesting and less developed task. This is primarily because linguistic resources such as a rich corpus are not available for Urdu. Yet, a few attempts have been made at developing Urdu speech recognition frameworks using traditional approaches such as Hidden Markov Models and Neural Networks. In this work, we investigate the use of thr...
Automatic Diacritization for Urdu
Urdu language is written in Arabic script. In this script, the consonantal context is clearly represented, but the vocalic sounds are represented (mostly) by marks or diacritics, which are optional and normally not written. Readers can guess the diacritics and thus can pronounce words correctly, based on their knowledge of the language. But un-diacritized Urdu text creates ambiguity for novice ...
Urdu Qaeda: Recognition System for Isolated Urdu Characters
This paper presents an online system for recognizing isolated, hand-sketched Urdu characters drawn on a Tablet PC. Attributes of Urdu characters are analyzed to define a set of features which are then trained and classified using a weighted, linear classifier. As a proof of concept, we have integrated our recognition algorithm into an application used to help people learn the Urdu language. Pre...
Automatic Recognition of Offline Handwritten Urdu Digits In Unconstrained Environment Using Daubechies Wavelet Transforms
This paper presents an optical character recognition system for handwritten Urdu digits. A lot of work has been done on recognizing characters and numerals of various languages such as Devanagari, English, Chinese, and Arabic. But in the case of handwritten Urdu digits, very little work has been reported. Different Daubechies wavelet transforms are used in this work for feature extraction. Als...
DWT features performance analysis for automatic speech recognition of Urdu
This paper presents work on automatic speech recognition of the Urdu language, using a comparative analysis of Discrete Wavelet Transform (DWT) based features and Mel Frequency Cepstral Coefficients (MFCC). These features have been extracted for one hundred isolated words of Urdu, each word uttered by ten different speakers. The words have been selected from the most frequently used words of ...
Journal
Journal title: Applied Sciences
Year: 2021
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app11198842